-
The microbes present in the human gastrointestinal tract are regularly linked to human health and disease outcomes. Thanks to technological and methodological advances in recent years, metagenomic sequencing data, and computational methods designed to analyze metagenomic data, have contributed to improved understanding of the link between the human gut microbiome and disease. However, while numerous methods have been recently developed to extract quantitative and qualitative results from host-associated microbiome data, improved computational tools are still needed to track microbiome dynamics with short-read sequencing data. Previously we have proposed KOMB as a de novo tool for identifying copy number variations in metagenomes for characterizing microbial genome dynamics in response to perturbations. In this work, we present KombOver (KO), which includes four key contributions with respect to our previous work: (i) it scales to large microbiome study cohorts, (ii) it includes both k-core and K-truss based analysis, (iii) we provide the foundation of a theoretical understanding of the relation between various graph-based metagenome representations, and (iv) we provide an improved user experience with easier-to-run code and more descriptive outputs/results. To highlight the aforementioned benefits, we applied KO to nearly 1000 human microbiome samples, requiring less than 10 minutes and 10 GB RAM per sample to process these data. Furthermore, we highlight how graph-based approaches such as k-core and K-truss can be informative for pinpointing microbial community dynamics within a myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) cohort. KO is open source and available for download/use at: https://github.com/treangenlab/komb
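For readers unfamiliar with the k-core notion named in the abstract above, the following is a minimal sketch of the generic k-core "peeling" procedure on a small undirected graph. It is not KombOver's implementation: the function name `core_numbers`, the toy edge list, and the node labels are assumptions made purely for illustration, and the loop is a simple quadratic-time version rather than an optimized one.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected simple graph.

    Generic peeling: repeatedly remove a node of minimum remaining
    degree; a node's core number is the largest minimum degree seen
    up to the moment it is removed.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical toy graph: a triangle (a, b, c) with one pendant node d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
toy_cores = core_numbers(toy_edges)
```

In this toy graph the triangle nodes fall in the 2-core while the pendant node only reaches the 1-core, which is the kind of densely connected versus peripheral distinction that a k-core analysis exposes.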
-
Abstract Tiled amplicon sequencing has served as an essential tool for tracking the spread and evolution of pathogens. Over 15 million complete SARS-CoV-2 genomes are now publicly available, most sequenced and assembled via tiled amplicon sequencing. While computational tools for tiled amplicon design exist, they require downstream manual optimization both computationally and experimentally, which is slow and costly. Here we present Olivar, a first step towards a fully automated, variant-aware design of tiled amplicons for pathogen genomes. Olivar converts each nucleotide of the target genome into a numeric risk score, capturing undesired sequence features that should be avoided. In a direct comparison with PrimalScheme, we show that Olivar has fewer mismatches overlapping with primers and fewer predicted PCR byproducts. We also compare Olivar head-to-head with ARTIC v4.1, the most widely used primer set for SARS-CoV-2 sequencing, and show that Olivar yields read mapping rates (~90%) similar to, and coverage better than, the manually designed ARTIC v4.1 amplicons. We also evaluate Olivar on real wastewater samples and find that Olivar has up to 3-fold higher mapping rates while retaining similar coverage. In summary, Olivar automates and accelerates the generation of tiled amplicons, even in situations of high mutation frequency and/or density. Olivar is available online as a web application at https://olivar.rice.edu and can be installed locally as a command line tool with Bioconda. Source code, installation guide, and usage are available at https://github.com/treangenlab/Olivar.
-
Abstract As clinical testing declines, wastewater monitoring can provide crucial surveillance on the emergence of SARS-CoV-2 variants of concern (VoCs) in communities. In this paper we present QuaID, a novel bioinformatics tool for VoC detection based on quasi-unique mutations. The benefits of QuaID are three-fold: (i) it provides up to 3-week earlier VoC detection, (ii) it detects VoCs accurately (>95% precision on simulated benchmarks), and (iii) it leverages all mutational signatures (including insertions & deletions).
-
Abstract Deep Learning (DL) has recently enabled unprecedented advances in one of the grand challenges in computational biology: the half-century-old problem of protein structure prediction. In this paper we discuss recent advances, limitations, and future perspectives of DL on five broad areas: protein structure prediction, protein function prediction, genome engineering, systems biology and data integration, and phylogenetic inference. We discuss each application area and cover the main bottlenecks of DL approaches, such as training data, problem scope, and the ability to leverage existing DL architectures in new contexts. To conclude, we provide a summary of the subject-specific and general challenges for DL across the biosciences.
-
In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNAnexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.
